Go back to the Preprocessing page, which is useful for keeping track of the files created during preprocessing.

Let us set some global options for all code chunks in this document.

knitr::opts_chunk$set(
  message = FALSE,    # Disable messages printed by R code chunks
  warning = FALSE,    # Disable warnings printed by R code chunks
  echo = TRUE,        # Show R code within code chunks in output
  include = TRUE,     # Include both R code and its results in output
  eval = TRUE,        # Evaluate R code chunks
  cache = FALSE,      # Disable caching, so every chunk is re-run on each render
  fig.align = "center",
  out.width = "100%",
  retina = 2,
  error = TRUE,
  collapse = FALSE
)
rm(list = ls())
set.seed(1982)

1 Import libraries

library(sf)
library(jsonlite)
library(dplyr)

library(here)
library(rmarkdown)
library(listviewer) # to use jsonedit()
library(grateful) # Cite all loaded packages

rm(list = ls()) # Clear the workspace
set.seed(1982) # Set seed for reproducibility

2 Load the data

# read shape file
tomtom_shp <- st_read(here("data_files/San_Francisco_data_from_TomTom/network.shp"), quiet = TRUE)

# read json file and select the information of interest
raw_json <- fromJSON(here("data_files/San_Francisco_data_from_TomTom.json"))

2.1 Explore the data

# Displaying the structure of the nested list
jsonedit(raw_json)
tomtom_shp |> head(5) |> paged_table()
tomtom_shp |> dim()
## [1] 67199     8
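The nested list explored above has the shape it does because `fromJSON()` simplifies arrays of JSON objects into data frames. The following is a minimal sketch on a toy JSON string, not the real TomTom file: the nesting `network > segmentResults > segmentTimeResults` mirrors the path used in this document, while `segmentId`, `timeSet`, and `averageSpeed` are hypothetical field names chosen for illustration.

```r
library(jsonlite)

# Toy JSON mimicking the nesting network > segmentResults > segmentTimeResults.
# segmentId, timeSet and averageSpeed are hypothetical field names.
toy_txt <- '{
  "network": {
    "segmentResults": [
      {"segmentId": 1,
       "segmentTimeResults": [{"timeSet": 1, "averageSpeed": 30}]},
      {"segmentId": 2,
       "segmentTimeResults": [{"timeSet": 1, "averageSpeed": 25,
                               "standardDeviationSpeed": 3}]}
    ]
  }
}'

toy_raw <- fromJSON(toy_txt)

# segmentResults is simplified to a data frame; the nested arrays become a
# list column holding one small data frame per segment.
toy_results <- toy_raw$network$segmentResults$segmentTimeResults

# Number of columns per segment; it differs when a field is absent
sapply(toy_results, ncol)
```

In this toy case the two elements have 2 and 3 columns. The same mechanism would explain why the real list mixes data frames with different column counts.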

3 Extract the relevant data

# Extract the relevant data from the JSON file
tomtom_json <- raw_json$network$segmentResults$segmentTimeResults
# Initialize empty data frames for 10 and 12 columns
df_10_columns <- data.frame()
df_12_columns <- data.frame()

# Iterate through the list of data frames
for (i in seq_along(tomtom_json)) {
  # Check the number of columns in the current data frame
  num_cols <- ncol(tomtom_json[[i]])
  
  # Add list number column
  tomtom_json[[i]]$List_Number <- i
  
  # Append to the appropriate data frame based on the number of columns
  if (num_cols == 10) {
    df_10_columns <- rbind(df_10_columns, tomtom_json[[i]])
  } else if (num_cols == 12) {
    df_12_columns <- rbind(df_12_columns, tomtom_json[[i]])
  }
}

# add two columns with NA values so that we can rbind later
from_10_to_12 = df_10_columns %>% mutate(standardDeviationSpeed = NA, travelTimeStandardDeviation = NA)

# rbind and order by List_Number
almost_tomtom = rbind(from_10_to_12, df_12_columns) %>% arrange(List_Number)
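As a possible alternative to the loop and the manual NA padding above, `dplyr::bind_rows()` matches columns by name and fills missing ones with `NA`. The sketch below uses a toy list, not the real data; apart from `standardDeviationSpeed` and `List_Number`, the column names are hypothetical.

```r
library(dplyr)

# Toy stand-in for tomtom_json: a list of data frames with differing columns.
# timeSet and averageSpeed are hypothetical column names.
toy_json <- list(
  data.frame(timeSet = 1L, averageSpeed = 30),
  data.frame(timeSet = 1L, averageSpeed = 25, standardDeviationSpeed = 3),
  data.frame(timeSet = 1L, averageSpeed = 40)
)

# bind_rows() matches columns by name and fills absent ones with NA;
# .id records each element's position in the list as a character column.
toy_bound <- bind_rows(toy_json, .id = "List_Number") %>%
  mutate(List_Number = as.integer(List_Number))

toy_bound
```

One behavioral difference is worth keeping in mind: the loop silently skips any element whose column count is neither 10 nor 12, whereas `bind_rows()` keeps every element. Binding once also avoids growing a data frame inside a loop, which can become slow for tens of thousands of elements.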

# bind the shapefile and the dataset obtained above column-wise (rows are matched by position, not by a key)
casi_tomtom = bind_cols(tomtom_shp, almost_tomtom) %>% 
  mutate(FRC = as.character(FRC))
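Because `bind_cols()` pairs rows purely by position, the binding step above is only meaningful if row i of both tables describes the same segment. The following is a minimal sketch of one possible check. It assumes each list element contributes exactly one row, which the matching row counts reported in this document suggest. The helper `rows_are_aligned()` and the toy tables are hypothetical.

```r
# Hypothetical helper: TRUE only if every list element contributed exactly
# one row, in order, so that row i of both tables refers to the same segment.
rows_are_aligned <- function(n_shp_rows, list_number) {
  identical(as.integer(list_number), seq_len(n_shp_rows))
}

# Toy stand-ins for the shapefile attributes and the JSON-derived table
toy_shp  <- data.frame(Id = c(101, 102, 103))
ok_json  <- data.frame(List_Number = 1:3, averageSpeed = c(30, 25, 40))
bad_json <- data.frame(List_Number = c(1L, 3L), averageSpeed = c(30, 40))

rows_are_aligned(nrow(toy_shp), ok_json$List_Number)   # TRUE
rows_are_aligned(nrow(toy_shp), bad_json$List_Number)  # FALSE: element 2 was dropped
```

Such a check can reveal dropped or duplicated elements. It cannot confirm that the JSON file and the shapefile were exported in the same segment order.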

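# expand the speedPercentiles list column into one column per percentile (5% to 95% in steps of 5)
PERCENTILES = do.call(rbind, casi_tomtom$speedPercentiles) %>% as.data.frame()
names(PERCENTILES) = paste(seq(5, 95, by = 5), "percentile", sep = "")

# append the percentile columns and drop the original list column
tomtom = bind_cols(casi_tomtom, PERCENTILES) %>% dplyr::select(-speedPercentiles)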


# save the obtained data set
save(tomtom, file = here("data_files/tomtom.RData"))
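The percentile expansion above can be illustrated on a toy list column. This is a sketch under the assumption, implied by the naming step, that each segment carries 19 percentile values; the numbers themselves are made up.

```r
# Toy list column: one vector of 19 speed percentiles per segment
# (values are made up for illustration).
toy_percentiles <- list(
  seq(10, 64, length.out = 19),
  seq(20, 74, length.out = 19)
)

# Every element must have the same length for rbind() to give a clean matrix
stopifnot(all(lengths(toy_percentiles) == 19))

# Stack the vectors row-wise, then label columns 5percentile, ..., 95percentile
toy_wide <- as.data.frame(do.call(rbind, toy_percentiles))
names(toy_wide) <- paste0(seq(5, 95, by = 5), "percentile")

dim(toy_wide)  # 2 rows, 19 columns
```

The explicit length check is a design choice: if any element had a different length, `rbind()` would recycle values with only a warning, and the column labels would no longer match the data.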

3.1 Explore the data

tomtom |> head(5) |> paged_table()
tomtom |> dim()
## [1] 67199    39
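For later pages that reuse the saved file, it may help to recall how `load()` behaves: it restores objects under the names they had when saved. The sketch below uses a toy object and a temporary file rather than `data_files/tomtom.RData`.

```r
# Toy object standing in for the tomtom data set
toy_tomtom <- data.frame(Id = 1:3, FRC = c("0", "4", "6"))

# Write to a temporary file so the sketch leaves no trace in the project
tmp_file <- tempfile(fileext = ".RData")
save(toy_tomtom, file = tmp_file)

# load() restores objects under their saved names and returns those names
rm(toy_tomtom)
restored_names <- load(tmp_file)
restored_names
```

Because `load()` can overwrite existing objects with the same name, checking the returned names is a cheap safeguard.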

4 Description of the Functional Road Classes (FRC) values

The Functional Road Classes (FRC) values classify roads by their importance and usage. The table below (taken from this page) describes each FRC value. We highlight this classification because it later determines the size of the graph.

# Load necessary library
library(knitr)

# Create the table data
table_data <- data.frame(
  `FRC VALUE` = c(0, 1, 2, 3, 4, 5, 6, 7, 8),
  `Short Description` = c(
    "Motorways; Freeways; Major Roads",
    "Major Roads less important than Motorways",
    "Other Major Roads",
    "Secondary Roads",
    "Local Connecting Roads",
    "Local Roads of High Importance",
    "Local Roads",
    "Local Roads of Minor Importance",
    "Other Roads"
  ),
  `Long Description` = c(
    "All roads that are officially assigned as motorways.",
    "All roads of high importance, but not officially assigned as motorways, that are part of a connection used for international and national traffic and transport.",
    "All roads used to travel between different neighboring regions of a country.",
    "All roads used to travel between different parts of the same region.",
    "All roads making all settlements accessible or making parts (north, south, east, west, and central) of a settlement accessible.",
    "All local roads that are the main connections in a settlement. These are the roads where important through traffic is possible e.g.,: • arterial roads within suburban areas, industrial areas or residential areas, • a rural road, which has the sole function of connecting to a national park or important tourist attraction.",
    "All roads used to travel within a part of a settlement or roads of minor connecting importance in a rural area.",
    "All roads that only have a destination function, e.g., dead-end roads, roads inside a living area, alleys: narrow roads between buildings, in a park or garden.",
    "All other roads that are less important for a navigation system: • a path: a road that is too small to be driven by a passenger car, • bicycle paths or footpaths that are especially designed as such, • stairs, • pedestrian tunnel, • pedestrian bridge, • alleys that are too small to be driven by a passenger car."
  )
)

# Print the table
kable(table_data, align = "cll", 
      caption = "Description of the Functional Road Classes (FRC) values.", 
      col.names = c("FRC VALUE", "Short Description", "Long Description"))
Description of the Functional Road Classes (FRC) values.

| FRC VALUE | Short Description | Long Description |
|:---:|:---|:---|
| 0 | Motorways; Freeways; Major Roads | All roads that are officially assigned as motorways. |
| 1 | Major Roads less important than Motorways | All roads of high importance, but not officially assigned as motorways, that are part of a connection used for international and national traffic and transport. |
| 2 | Other Major Roads | All roads used to travel between different neighboring regions of a country. |
| 3 | Secondary Roads | All roads used to travel between different parts of the same region. |
| 4 | Local Connecting Roads | All roads making all settlements accessible or making parts (north, south, east, west, and central) of a settlement accessible. |
| 5 | Local Roads of High Importance | All local roads that are the main connections in a settlement. These are the roads where important through traffic is possible e.g.,: • arterial roads within suburban areas, industrial areas or residential areas, • a rural road, which has the sole function of connecting to a national park or important tourist attraction. |
| 6 | Local Roads | All roads used to travel within a part of a settlement or roads of minor connecting importance in a rural area. |
| 7 | Local Roads of Minor Importance | All roads that only have a destination function, e.g., dead-end roads, roads inside a living area, alleys: narrow roads between buildings, in a park or garden. |
| 8 | Other Roads | All other roads that are less important for a navigation system: • a path: a road that is too small to be driven by a passenger car, • bicycle paths or footpaths that are especially designed as such, • stairs, • pedestrian tunnel, • pedestrian bridge, • alleys that are too small to be driven by a passenger car. |
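To illustrate how the FRC classification can control the size of a road network, the sketch below filters a toy table by class. The data and the cutoff are hypothetical; `FRC` is stored as character, as in the casting step earlier in this document.

```r
# Toy road table (made-up data); FRC is stored as character
toy_roads <- data.frame(
  Id  = 1:8,
  FRC = c("0", "1", "3", "4", "6", "6", "7", "7")
)

# Number of segments per class
table(toy_roads$FRC)

# Keep only the more important classes (the cutoff at class 3 is an arbitrary example)
keep_classes <- as.character(0:3)
toy_major <- toy_roads[toy_roads$FRC %in% keep_classes, ]

nrow(toy_major)  # 3 of the 8 toy segments remain
```

Since the values are character strings, membership tests with `%in%` are safer than comparisons such as `FRC <= "3"`, which would compare strings rather than numbers.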

5 References

cite_packages(output = "paragraph", out.dir = ".")

We used R version 4.4.0 (R Core Team 2024) and the following R packages: here v. 1.0.1 (Müller 2020), htmltools v. 0.5.8.1 (Cheng et al. 2024), INLA v. 24.6.27 (Rue, Martino, and Chopin 2009; Lindgren, Rue, and Lindström 2011; Martins et al. 2013; Lindgren and Rue 2015; De Coninck et al. 2016; Rue et al. 2017; Verbosio et al. 2017; Bakka et al. 2018; Kourounis, Fuchs, and Schenk 2018), inlabru v. 2.10.1.9010 (Yuan et al. 2017; Bachl et al. 2019), knitr v. 1.47 (Xie 2014, 2015, 2024), listviewer v. 4.0.0 (de Jong, Gainer, and Russell 2023), mapview v. 2.11.2 (Appelhans et al. 2023), MetricGraph v. 1.3.0.9000 (Bolin, Simas, and Wallin 2023b, 2023a, 2023c, 2024; Bolin et al. 2023), patchwork v. 1.2.0 (Pedersen 2024), plotly v. 4.10.4 (Sievert 2020), rmarkdown v. 2.27 (Xie, Allaire, and Grolemund 2018; Xie, Dervieux, and Riederer 2020; Allaire et al. 2024), rSPDE v. 2.3.3.9000 (Bolin and Kirchner 2020; Bolin and Simas 2023; Bolin, Simas, and Xiong 2023), scales v. 1.3.0 (Wickham, Pedersen, and Seidel 2023), sf v. 1.0.16 (Pebesma 2018; Pebesma and Bivand 2023), tidyverse v. 2.0.0 (Wickham et al. 2019), TSstudio v. 0.1.7 (Krispin 2023), xaringanExtra v. 0.8.0 (Aden-Buie and Warkentin 2024).

Aden-Buie, Garrick, and Matthew T. Warkentin. 2024. xaringanExtra: Extras and Extensions for xaringan Slides. https://CRAN.R-project.org/package=xaringanExtra.
Allaire, JJ, Yihui Xie, Christophe Dervieux, Jonathan McPherson, Javier Luraschi, Kevin Ushey, Aron Atkins, et al. 2024. rmarkdown: Dynamic Documents for r. https://github.com/rstudio/rmarkdown.
Appelhans, Tim, Florian Detsch, Christoph Reudenbach, and Stefan Woellauer. 2023. mapview: Interactive Viewing of Spatial Data in r. https://CRAN.R-project.org/package=mapview.
Bachl, Fabian E., Finn Lindgren, David L. Borchers, and Janine B. Illian. 2019. “inlabru: An R Package for Bayesian Spatial Modelling from Ecological Survey Data.” Methods in Ecology and Evolution 10: 760–66. https://doi.org/10.1111/2041-210X.13168.
Bakka, Haakon, Håvard Rue, Geir-Arne Fuglstad, Andrea I. Riebler, David Bolin, Janine Illian, Elias Krainski, Daniel P. Simpson, and Finn K. Lindgren. 2018. “Spatial Modelling with INLA: A Review.” WIRES (Invited Extended Review) xx (Feb): xx–. http://arxiv.org/abs/1802.06350.
Bolin, David, and Kristin Kirchner. 2020. “The Rational SPDE Approach for Gaussian Random Fields with General Smoothness.” Journal of Computational and Graphical Statistics 29 (2): 274–85. https://doi.org/10.1080/10618600.2019.1665537.
Bolin, David, Mihály Kovács, Vivek Kumar, and Alexandre B. Simas. 2023. “Regularity and Numerical Approximation of Fractional Elliptic Differential Equations on Compact Metric Graphs.” Mathematics of Computation. https://doi.org/10.1090/mcom/3929.
Bolin, David, and Alexandre B. Simas. 2023. rSPDE: Rational Approximations of Fractional Stochastic Partial Differential Equations. https://CRAN.R-project.org/package=rSPDE.
Bolin, David, Alexandre B. Simas, and Jonas Wallin. 2023a. “Markov Properties of Gaussian Random Fields on Compact Metric Graphs.” arXiv Preprint arXiv:2304.03190. https://doi.org/10.48550/arXiv.2304.03190.
———. 2023b. MetricGraph: Random Fields on Metric Graphs. https://CRAN.R-project.org/package=MetricGraph.
———. 2023c. “Statistical Inference for Gaussian Whittle-Matérn Fields on Metric Graphs.” arXiv Preprint arXiv:2304.10372. https://doi.org/10.48550/arXiv.2304.10372.
———. 2024. “Gaussian Whittle-Matérn Fields on Metric Graphs.” Bernoulli 30 (2): 1611–39. https://doi.org/10.3150/23-BEJ1647.
Bolin, David, Alexandre B. Simas, and Zhen Xiong. 2023. “Covariance-Based Rational Approximations of Fractional SPDEs for Computationally Efficient Bayesian Inference.” Journal of Computational and Graphical Statistics. https://doi.org/10.1080/10618600.2023.2231051.
Cheng, Joe, Carson Sievert, Barret Schloerke, Winston Chang, Yihui Xie, and Jeff Allen. 2024. htmltools: Tools for HTML. https://CRAN.R-project.org/package=htmltools.
De Coninck, Arne, Bernard De Baets, Drosos Kourounis, Fabio Verbosio, Olaf Schenk, Steven Maenhout, and Jan Fostier. 2016. “Needles: Toward Large-Scale Genomic Prediction with Marker-by-Environment Interaction.” Genetics 203 (1): 543–55. https://doi.org/10.1534/genetics.115.179887.
de Jong, Jos, Mac Gainer, and Kent Russell. 2023. listviewer: htmlwidget for Interactive Views of r Lists. https://CRAN.R-project.org/package=listviewer.
Kourounis, D., A. Fuchs, and O. Schenk. 2018. “Towards the Next Generation of Multiperiod Optimal Power Flow Solvers.” IEEE Transactions on Power Systems PP (99): 1–10. https://doi.org/10.1109/TPWRS.2017.2789187.
Krispin, Rami. 2023. TSstudio: Functions for Time Series Analysis and Forecasting. https://CRAN.R-project.org/package=TSstudio.
Lindgren, Finn, and Håvard Rue. 2015. “Bayesian Spatial Modelling with R-INLA.” Journal of Statistical Software 63 (19): 1–25. http://www.jstatsoft.org/v63/i19/.
Lindgren, Finn, Håvard Rue, and Johan Lindström. 2011. “An Explicit Link Between Gaussian Fields and Gaussian Markov Random Fields: The Stochastic Partial Differential Equation Approach (with Discussion).” Journal of the Royal Statistical Society B 73 (4): 423–98.
Martins, Thiago G., Daniel Simpson, Finn Lindgren, and Håvard Rue. 2013. “Bayesian Computing with INLA: New Features.” Computational Statistics and Data Analysis 67: 68–83.
Müller, Kirill. 2020. here: A Simpler Way to Find Your Files. https://CRAN.R-project.org/package=here.
Pebesma, Edzer. 2018. “Simple Features for R: Standardized Support for Spatial Vector Data.” The R Journal 10 (1): 439–46. https://doi.org/10.32614/RJ-2018-009.
Pebesma, Edzer, and Roger Bivand. 2023. Spatial Data Science: With applications in R. Chapman and Hall/CRC. https://doi.org/10.1201/9780429459016.
Pedersen, Thomas Lin. 2024. patchwork: The Composer of Plots. https://CRAN.R-project.org/package=patchwork.
R Core Team. 2024. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/.
Rue, Håvard, Sara Martino, and Nicholas Chopin. 2009. “Approximate Bayesian Inference for Latent Gaussian Models Using Integrated Nested Laplace Approximations (with Discussion).” Journal of the Royal Statistical Society B 71: 319–92.
Rue, Håvard, Andrea I. Riebler, Sigrunn H. Sørbye, Janine B. Illian, Daniel P. Simpson, and Finn K. Lindgren. 2017. “Bayesian Computing with INLA: A Review.” Annual Reviews of Statistics and Its Applications 4 (March): 395–421. http://arxiv.org/abs/1604.00860.
Sievert, Carson. 2020. Interactive Web-Based Data Visualization with r, Plotly, and Shiny. Chapman and Hall/CRC. https://plotly-r.com.
Verbosio, Fabio, Arne De Coninck, Drosos Kourounis, and Olaf Schenk. 2017. “Enhancing the Scalability of Selected Inversion Factorization Algorithms in Genomic Prediction.” Journal of Computational Science 22 (Supplement C): 99–108. https://doi.org/10.1016/j.jocs.2017.08.013.
Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.
Wickham, Hadley, Thomas Lin Pedersen, and Dana Seidel. 2023. scales: Scale Functions for Visualization. https://CRAN.R-project.org/package=scales.
Xie, Yihui. 2014. “knitr: A Comprehensive Tool for Reproducible Research in R.” In Implementing Reproducible Computational Research, edited by Victoria Stodden, Friedrich Leisch, and Roger D. Peng. Chapman and Hall/CRC.
———. 2015. Dynamic Documents with R and Knitr. 2nd ed. Boca Raton, Florida: Chapman and Hall/CRC. https://yihui.org/knitr/.
———. 2024. knitr: A General-Purpose Package for Dynamic Report Generation in r. https://yihui.org/knitr/.
Xie, Yihui, J. J. Allaire, and Garrett Grolemund. 2018. R Markdown: The Definitive Guide. Boca Raton, Florida: Chapman and Hall/CRC. https://bookdown.org/yihui/rmarkdown.
Xie, Yihui, Christophe Dervieux, and Emily Riederer. 2020. R Markdown Cookbook. Boca Raton, Florida: Chapman and Hall/CRC. https://bookdown.org/yihui/rmarkdown-cookbook.
Yuan, Yuan, Fabian E. Bachl, Finn Lindgren, David L. Borchers, et al. 2017. “Point Process Models for Spatio-Temporal Distance Sampling Data from a Large-Scale Survey of Blue Whales.” Ann. Appl. Stat. 11 (4): 2270–97. https://doi.org/10.1214/17-AOAS1078.